Quality Classifiers for Open Source Software Repositories
Abstract
Open Source Software (OSS) often relies on large repositories, such as SourceForge, for initial incubation. These repositories offer a wide variety of metadata that describes projects and their success. In this paper we propose a data mining approach for training classifiers on the metadata provided by such repositories. The classifiers learn to predict the successful continuation of an OSS project. The 'successfulness' of a project is defined as the confidence with which the classifier predicts that the project could be ported into popular OSS projects (such as FreeBSD or Gentoo Portage).
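As a rough illustration of the idea in the abstract, the sketch below trains a small logistic-regression classifier on repository metadata and reads its predicted probability as a success score. Everything specific here is an assumption made for illustration: the three features (developer count, downloads, recent activity, each scaled to [0, 1]), the toy data, the label meaning (1 = ported), and the choice of logistic regression itself. The abstract does not say which features or classifiers the paper uses.

```python
import math

def confidence(weights, bias, x):
    """Classifier confidence that a project with features x is 'ported'."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with plain batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = confidence(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

# Hypothetical, hand-made metadata rows: [developers, downloads, activity].
rows = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.9], [0.7, 0.6, 0.8],  # labeled as ported
    [0.1, 0.2, 0.1], [0.2, 0.1, 0.3], [0.3, 0.2, 0.2],  # labeled as not ported
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(rows, labels)

# The confidence is used directly as the 'successfulness' score.
score_active = confidence(weights, bias, [0.85, 0.9, 0.8])
score_dormant = confidence(weights, bias, [0.1, 0.1, 0.1])
print(f"active project:  {score_active:.2f}")
print(f"dormant project: {score_dormant:.2f}")
```

Using the probability rather than a hard yes/no label gives a graded score, so projects can be ranked by how confidently the classifier expects them to be ported.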
Similar resources
OSSMETER: Automated Measurement and Analysis of Open Source Software
Deciding whether an open source software (OSS) meets the required standards for adoption in terms of quality, maturity, activity of development and user support is not a straightforward process. It involves analysing various sources of information, including the project’s source code repositories, communication channels, and bug tracking systems. OSSMETER extends state-of-the-art techniques in ...
Software Quality Assessment of Open Source Software
The open source software ecosystem comprises more than a hundred thousand applications of varying quality. Individuals and organizations wishing to use open source software packages have scarce objective data to evaluate their quality. However, open source development projects by definition allow anybody to read, and therefore evaluate their source code. In addition, most projects also publish ...
Simplification of Training Data for Cross-Project Defect Prediction
Cross-project defect prediction (CPDP) plays an important role in estimating the most likely defect-prone software components, especially for new or inactive projects. To the best of our knowledge, few prior studies provide explicit guidelines on how to select suitable training data of quality from a large number of public software repositories. In this paper, we have proposed a training data s...
Comparative Analysis of Software Repository Metrics in BioPerl, BioJava and BioRuby
The open source programming languages, often with a biosuffix, i.e. BioPerl, BioJava, and BioRuby, have been widely used in bioinformatics and computational biology research. The computational tools written in these languages provide multiple functionalities as the languages make them flexible to create customized analysis and examination of biological data. In this paper, we investigate one of...
An Empirical Study on Design Pattern Usage on Open-Source Software
Currently, open source software communities are thriving and the number of projects that are available through well known code repositories is rapidly increasing over the years. The amount of code that is freely available to developers facilitates high reuse opportunities. One of the major concerns of developers when reusing code is the quality of the code that is going to be reused. Design pat...
Publication date: 2009